Vector spaces for historical linguistics: Using distributional semantics to study syntactic productivity in diachrony
Abstract
This paper describes an application of distributional semantics to the study of syntactic productivity in diachrony, i.e., the property of grammatical constructions to attract new lexical items over time. By providing an empirical measure of semantic similarity between words derived from lexical co-occurrences, distributional semantics not only reliably captures how the verbs in the distribution of a construction are related, but also enables the use of visualization techniques and statistical modeling to analyze the semantic development of a construction over time and identify the semantic determinants of syntactic productivity in naturally occurring data.
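To make the idea of a similarity measure "derived from lexical co-occurrences" concrete, here is a minimal sketch. It is not the paper's pipeline: the toy corpus, the window size, and the helper names `cooccurrence_vectors` and `cosine` are assumptions chosen for illustration. Each word is represented by counts of the words found near it, and two words are compared by the cosine of the angle between their count vectors.

```python
from collections import Counter
from math import sqrt


def cooccurrence_vectors(sentences, window=2):
    """Map each word to a Counter of words seen within `window` tokens of it."""
    vectors = {}
    for tokens in sentences:
        for i, word in enumerate(tokens):
            lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
            context = tokens[lo:i] + tokens[i + 1:hi]  # neighbours, excluding the word itself
            vectors.setdefault(word, Counter()).update(context)
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse count vectors stored as Counters."""
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    norm_u = sqrt(sum(x * x for x in u.values()))
    norm_v = sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


# Hypothetical toy corpus (an assumption, not data from the paper).
toy_corpus = [
    "she began to eat the bread".split(),
    "she began to devour the bread".split(),
    "he began to read the letter".split(),
]

vectors = cooccurrence_vectors(toy_corpus)
sim_eat_devour = cosine(vectors["eat"], vectors["devour"])
sim_eat_read = cosine(vectors["eat"], vectors["read"])
print(sim_eat_devour, sim_eat_read)
```

In this toy setting "eat" and "devour" share all of their contexts, so they score higher than "eat" and "read". Work on real corpora typically applies a weighting scheme and dimensionality reduction to the raw counts, which this sketch omits.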
Similar resources
Using distributional semantics to study syntactic productivity in diachrony: A case study
This paper investigates syntactic productivity in diachrony with a data-driven approach. Previous research indicates that syntactic productivity (the property of grammatical constructions to attract new lexical fillers) is largely driven by semantics, which calls for an operationalization of lexical meaning in the context of empirical studies. It is suggested that distributional semantics can f...
Category-theoretic quantitative compositional distributional models of natural language semantics
This thesis is about the problem of compositionality in distributional semantics. Distributional semantics presupposes that the meanings of words are a function of their occurrences in textual contexts. It models words as distributions over these contexts and represents them as vectors in high dimensional spaces. The problem of compositionality for such models concerns itself with how to produc...
A Frobenius Model of Information Structure in Categorical Compositional Distributional Semantics
The categorical compositional distributional model of Coecke et al. (2010) provides a linguistically motivated procedure for computing the meaning of a sentence as a function of the distributional meaning of the words therein. The theoretical framework allows for reasoning about compositional aspects of language and offers structural ways of studying the underlying relationships. While the mode...
Derivational Smoothing for Syntactic Distributional Semantics
Syntax-based vector spaces are used widely in lexical semantics and are more versatile than word-based spaces (Baroni and Lenci, 2010). However, they are also sparse, with resulting reliability and coverage problems. We address this problem by derivational smoothing, which uses knowledge about derivationally related words (oldish → old) to improve semantic similarity estimates. We develop a set ...
Coordination in Categorical Compositional Distributional Semantics
An open problem with categorical compositional distributional semantics is the representation of words that are considered semantically vacuous from a distributional perspective, such as determiners, prepositions, relative pronouns or coordinators. This paper deals with the topic of coordination between identical syntactic types, which accounts for the majority of coordination cases in language...